SCORE Search Results Details for Application 09821877 and Search Result us-09-821-877-1.rge.




GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd.



OM nucleic - nucleic search, using sw model

Run on:         July 22, 2006, 03:40:49 ; Search time 10969 Seconds
                                           (without alignments)
                                           6885.031 Million cell updates/sec

Title:          US-09-821-877-1
Perfect score:  1181
Sequence:       1 atggggcagaatctttccac...tacatttaaaccctaataaa 1181

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       6366136 seqs, 31973710525 residues

Total number of hits satisfying chosen parameters:      12732272

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      GenEmbl:*
                1:  gb_env:*
                2:  gb_pat:*
                3:  gb_ph:*
                4:  gb_pl:*
                5:  gb_pr:*
                6:  gb_ro:*
                7:  gb_sts:*
                8:  gb_sy:*
                9:  gb_un:*
                10: gb_vi:*
                11: gb_ov:*
                12: gb_htg:*
                13: gb_in:*
                14: gb_om:*
                15: gb_ba:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 
                         %
Result                 Query
   No.      Score      Match        Length   DB   ID          Description
     1     1136.2       96.2          1186   10   AY230122    AY230122 Hepatitis
     2     1133.2       96.0          1170   10   AB104717    AB104717 Hepatitis
     3       1133       95.9          2612    2   E00007      E00007 DNA coding
     4       1133       95.9          2743    2   A08967      A08967 Hepatitis B
     5       1133       95.9          2743   10   HPBADYW     J02202 human hepat
     6     1131.6       95.8          1170   10   AB104723    AB104723 Hepatitis
     7     1128.4       95.5          1170   10   AB104715    AB104715 Hepatitis
     8     1128.2       95.5   [illegible]    2   BD181818    BD181818 Hepatitis
     9     1126.8       95.4          1170   10   AY603461    AY603461 Hepatitis
    10     1126.6       95.4   [illegible]    8   DQ219811    DQ219811 Synthetic
    11     1125.2       95.3          1170   10   AB104722    AB104722 Hepatitis
    12     1125.2       95.3          1170   10   AY576427    AY576427 Hepatitis
    13     1125.2       95.3          1170   10   AY603460    AY603460 Hepatitis
    14     1125.2       95.3          1170   10   AY603464    AY603464 Hepatitis
    15       1125       95.3   [illegible]   10   AY230128    AY230128 Hepatitis
    16       1123       95.1   [illegible]   10   DQ131119    DQ131119 Hepatitis
    17       1122       95.0          1170   10   AY603456    AY603456 Hepatitis
    18       1122       95.0          1170   10   AY603459    AY603459 Hepatitis
    19     1121.8       95.0   [illegible]   10   AY230125    AY230125 Hepatitis
    20     1121.8       95.0   [illegible]   10   AY236162    AY236162 Hepatitis
    21     1121.8       95.0          3187   10   DQ336678    DQ336678 Hepatitis
    22     1120.4       94.9          1170   10   AY603451    AY603451 Hepatitis
    23     1120.4       94.9          1170   10   AY603457    AY603457 Hepatitis
    24     1120.2       94.9          3187   10   DQ304548    DQ304548 Hepatitis
    25     1120.2       94.9          3187   10   DQ336675    DQ336675 Hepatitis
    26     1120.2       94.9          3187   10   DQ336679    DQ336679 Hepatitis
    27     1118.8       94.7          1170   10   AB104719    AB104719 Hepatitis
    28     1118.8       94.7          1170   10   AY603450    AY603450 Hepatitis
    29     1118.8       94.7          1170   10   AY603454    AY603454 Hepatitis
    30     1118.8       94.7          1170   10   AY603462    AY603462 Hepatitis
    31     1118.6       94.7          3187   10   DQ304549    DQ304549 Hepatitis
    32     1118.6       94.7          3187   10   DQ304551    DQ304551 Hepatitis
    33     1118.6       94.7          3187   10   DQ336674    DQ336674 Hepatitis
    34     1117.2       94.6          1170   10   AY603465    AY603465 Hepatitis
    35     1117.2       94.6          1201    2   AR011346    AR011346 Sequence
    36     1117.2       94.6          1201    2   I17984      I17984 Sequence 21
    37     1117.2       94.6          1285    2   AR011345    AR011345 Sequence
    38     1117.2       94.6          1285    2   I17983      I17983 Sequence 21
    39     1117.2       94.6          2342    2   A32618      A32618 Synthetic c
    40       1117       94.6          2637   10   AY902771    AY902771 Hepatitis
    41       1117       94.6          3187   10   DQ304547    DQ304547 Hepatitis
    42       1117       94.6          3187   10   DQ304550    DQ304550 Hepatitis
    43       1117       94.6          3187   10   DQ336676    DQ336676 Hepatitis
    44     1115.6       94.5          1170   10   AB074844    AB074844 Hepatitis
    45     1115.4       94.4          3187   10   DQ336677    DQ336677 Hepatitis
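
The Query Match percentages in this report appear to be the hit score expressed as a percentage of the perfect score (1181 for this query). That relation is inferred from the reported values (for example, score 1136.2 is reported with a 96.2% query match); it is not taken from GenCore documentation, so the sketch below is only a consistency check. The function name `query_match` is hypothetical.

```python
PERFECT_SCORE = 1181  # "Perfect score" reported in the run header


def query_match(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Return the score as a percentage of the perfect score, to one decimal.

    Assumption: this mirrors how the report derives its Query Match values.
    It reproduces the percentages printed for the hits, but the rule itself
    is inferred, not documented.
    """
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    # Scores taken from the report (results 1, 2, 3 and 45).
    for score in (1136.2, 1133.2, 1133, 1115.4):
        print(f"{score:>7}  {query_match(score):.1f}%")
```

Because the perfect score is fixed for a given query, hits with equal scores always share the same Query Match value regardless of the database sequence length.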



ALIGNMENTS 



RESULT 1
AY230122
LOCUS       AY230122                1186 bp    DNA     linear   VRL 23-NOV-2003
DEFINITION  Hepatitis B virus isolate case 18 non-tumor large surface protein,
            middle surface protein, and small surface protein genes, complete
            cds.
ACCESSION   AY230122
VERSION     AY230122.1  GI:38374274
KEYWORDS    .
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1  (bases 1 to 1186)
  AUTHORS   Raimondo,G., Pollicino,T. and Raffa,G.
  TITLE     Occult HBV in liver cancer
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1186)
  AUTHORS   Raimondo,G., Pollicino,T. and Raffa,G.
  TITLE     Direct Submission
  JOURNAL   Submitted (04-FEB-2003) Internal Medicine, University of Messina,
            via Consolare Valeria, Messina 98124, Italy
FEATURES             Location/Qualifiers
     source          1..1186
                     /organism="Hepatitis B virus"
                     /virion
                     /mol_type="genomic DNA"
                     /isolate="case 18 non-tumor"
                     /db_xref="taxon:10407"
     misc_feature    1..1170
                     /note="similar to large surface protein"
     misc_feature    325..1170
                     /note="similar to middle surface protein"
     misc_feature    490..1170
                     /note="similar to small surface protein"
ORIGIN

  Query Match             96.2%;  Score 1136.2;  DB 10;  Length 1186;
  Best Local Similarity   97.6%;  Pred. No. 0;
  Matches 1153;  Conservative    0;  Mismatches   28;  Indels    0;  Gaps    0;



Qy          1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy         61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
              |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db         61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120

Qy        121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
              ||||||||||||||||||||||||| |||||||||||| ||||| |||||||||||||||
Db        121 TGGCCAGACGCCAACAAGGTAGGAGGTGGAGCATTCGGGCTGGGATTCACCCCACCGCAC 180

Qy        181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
              |||||||||||||||||||||||||||||||||||||||  |||||||||||||||||||
Db        181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACCTTGCCAGCAAAT 240

Qy        241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
              ||||||||||| || |||||||||||||||||||||||||||||||| ||||||||||||
Db        241 CCGCCTCCTGCCTCTACCAATCGCCAGTCAGGAAGGCAGCCTACCCCTCTGTCTCCACCT 300

Qy        301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        301 TTGAGAAACACTCATCCTCAGGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360

Qy        361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        361 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy        421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy        481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
              |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540

Qy        541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600

Qy        601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||
Db        601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGACCCCAACC 660

Qy        661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy        721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy        781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840

Qy        841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
              ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db        841 ACGGGACCATGCAGAACCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy        901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
              |||||||| ||||||||||| ||||| |||||||||||||||||||||||||| ||||||
Db        901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCTTGGGCT 960

Qy        961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy       1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
              |||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||
Db       1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTCTGGCTTTCAGTTATATGGATG 1080

Qy       1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
              |||| ||| |||||||||||||||| || |||||||||||||||||||||||||| ||||
Db       1081 ATGTGGTATTGGGGGCCAAGTCTGTCCAGCATCTTGAGTCCCTTTTTACCGCTGTAACCA 1140

Qy       1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181
              ||||||||||| |||||||||||||||||||||||| ||||
Db       1141 ATTTTCTTTTGGCTTTGGGTATACATTTAAACCCTAACAAA 1181
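
The statistics reported for Result 1 are internally consistent under a simple identity scoring rule. A minimal sketch, assuming each match scores +1 and each mismatch costs 0.6: the 0.6 penalty is not stated anywhere in this report. It is back-calculated from the reported scores of Results 1 to 3 (for example, 1153 - 28 x 0.6 = 1136.2) and would not by itself cover alignments with gaps, which would also incur the Gapop/Gapext penalties. `ungapped_score` and `best_local_similarity` are hypothetical helper names.

```python
MISMATCH_PENALTY = 0.6  # assumption: inferred from reported scores, not documented


def ungapped_score(matches: int, mismatches: int,
                   mismatch_penalty: float = MISMATCH_PENALTY) -> float:
    """Score of a gap-free alignment: +1 per match, minus a penalty per mismatch."""
    return round(matches - mismatch_penalty * mismatches, 1)


def best_local_similarity(matches: int, mismatches: int) -> float:
    """Matches as a percentage of aligned positions (no indels assumed)."""
    return round(100.0 * matches / (matches + mismatches), 1)


if __name__ == "__main__":
    # (matches, mismatches) as reported for Results 1, 2 and 3.
    for matches, mismatches in ((1153, 28), (1147, 23), (1151, 30)):
        print(ungapped_score(matches, mismatches),
              best_local_similarity(matches, mismatches))
```

Under these assumptions the three pairs reproduce the reported Score and Best Local Similarity values for the first three results.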



RESULT 2
AB104717
LOCUS       AB104717                1170 bp    DNA     linear   VRL 26-JUN-2003
DEFINITION  Hepatitis B virus s gene for pre-S and S protein, complete cds,
            isolate: EG80.
ACCESSION   AB104717
VERSION     AB104717.1  GI:32261194
KEYWORDS    .
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1
  AUTHORS   Saudy,N., Sugauchi,F., Tanaka,Y., Suzuki,S., Aal,A.A., Zaid,M.A.,
            Agha,S. and Mizokami,M.
  TITLE     Genotypes and phylogenetic characterization of hepatitis B and
            delta viruses in Egypt
  JOURNAL   J. Med. Virol. 70 (4), 529-536 (2003)
   PUBMED   12794714
REFERENCE   2  (bases 1 to 1170)
  AUTHORS   Suzuki,S., Saudy,N., Sugauchi,F., Orito,E., Agha,S. and Mizokami,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-FEB-2003) Seiji Suzuki, Nagoya City University
            Graduate School, Department of clinical Molecular Informative
            Medicine; Mizuho, Nagoya, Aichi 467-8601, Japan
            (E-mail:seijis@med.nagoya-cu.ac.jp, Tel:81-52-853-8292,
            Fax:81-52-842-0021)
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /mol_type="genomic DNA"
                     /isolate="EG80"
                     /db_xref="taxon:10407"
                     /country="Egypt"
                     /note="genotype D"
     gene            1..1170
                     /gene="s"
     CDS             1..1170
                     /gene="s"
                     /codon_start=1
                     /product="pre-S and S protein"
                     /protein_id="BAC78512.1"
                     /db_xref="GI:32261195"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSPISSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCTTPAQGTSMYPSCCCTKPLDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
ORIGIN

  Query Match             96.0%;  Score 1133.2;  DB 10;  Length 1170;
  Best Local Similarity   98.0%;  Pred. No. 0;
  Matches 1147;  Conservative    0;  Mismatches   23;  Indels    0;  Gaps    0;



Qy          1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy         61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
              |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db         61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120

Qy        121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
              |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db        121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGATTCACCCCACCGCAC 180

Qy        181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
              |||||||||||||||||||||||||||||||||||||||  |||||||||||||||||||
Db        181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACCTTGCCAGCAAAT 240

Qy        241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
              |||||||||||||| |||||||||||||||||||||||||||||||| ||||||||||||
Db        241 CCGCCTCCTGCTTCTACCAATCGCCAGTCAGGAAGGCAGCCTACCCCTCTGTCTCCACCT 300

Qy        301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
               ||||||||||||||||||| ||||||||||||||||||||||| |||||||||||||||
Db        301 CTGAGAAACACTCATCCTCAGGCCATGCAGTGGAACTCCACAACCTTCCACCAAACTCTG 360

Qy        361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        361 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy        421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy        481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
              |||  |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 GCGTTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540

Qy        541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600

Qy        601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy        661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy        721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy        781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840

Qy        841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
              ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db        841 ACGGGACCATGCAGAACCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy        901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
              |||||||| ||||||| ||| ||||| |||||||||||||||||||||||||||||||||
Db        901 TGCTGTACCAAACCTTTGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy        961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy       1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy       1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
              |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db       1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy       1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
              ||||||||||||||||||||||||||||||
Db       1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
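
The marker line printed between each Qy and Db row flags the positions where the two sequences agree. A minimal sketch of how such a line can be rebuilt from two aligned, gap-free rows, using positions 481-540 of Result 2: `match_line` is a hypothetical helper, and the convention of `|` for identity and a space for a mismatch is an assumption about the display format.

```python
def match_line(qy: str, db: str) -> str:
    """Return '|' where the aligned bases are identical and ' ' elsewhere."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    return "".join("|" if q == d else " " for q, d in zip(qy.upper(), db.upper()))


if __name__ == "__main__":
    # Positions 481-540 of Result 2 (query vs. AB104717), copied from the report.
    qy = "GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG"
    db = "GCGTTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG"
    line = match_line(qy, db)
    print(qy)
    print(line)
    print(db)
    print("mismatches:", line.count(" "))
```

Summing the spaces over every block of an alignment gives a quick check against the Mismatches count in that result's statistics.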



RESULT 3
E00007
LOCUS       E00007                  2612 bp    DNA     linear   PAT 29-SEP-1997
DEFINITION  DNA coding of HBV antigen.
ACCESSION   E00007
VERSION     E00007.1  GI:2168318
KEYWORDS    JP 1980104887-A/1.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini;
            Hominidae; Homo.
REFERENCE   1  (bases 1 to 2612)
  AUTHORS   Kenesu,M. and Haintsu,E.S.
  TITLE     REARRANGED DNA MOLECULE AND METHOD
  JOURNAL   Patent: JP 1980104887-A 1 11-AUG-1980;
            BIOGEN NV
COMMENT     OS   human
            PN   JP 1980104887-A/1
            PD   11-AUG-1980
            PF   20-DEC-1979 JP 1979164945
            PR   22-DEC-1978 GB 78 49907, 27-DEC-1978 GB 78 50039,
            PR   01-NOV-1979 GB 79 7937910
            PI   KENESU MAREE, HAINTSU ERUNSUTO SHIYARAA
            PC   C12N15/00,C07H21/04,C12N1/00,C12P19/34,C12P21/02,C12Q1/00,
            PC   C12Q1/70//
            PC   C12R1/125,C12R1/19,C12R1/38,C12R1/645;
            CC   strandedness: Double;
            CC   topology: Linear;
            CC   hypothetical: No;
            CC   anti-sense: No;
            CC   *source: tissue_type=leukocyte;
            FH   Key             Location/Qualifiers
            FH
            FT   sig_peptide     1..87
            FT   mat_peptide     88..636
            FT                   /product='HBV nucleus antigen'
            FT   mat_peptide     1524..2201
            FT                   /product='HBV surface antigen'
            FT   CDS             1..639
            FT                   /product='HBV nucleus antigen'
            FT   CDS             1524..2204
            FT                   /product='HBV surface antigen'.
FEATURES             Location/Qualifiers
     source          1..2612
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN

  Query Match             95.9%;  Score 1133;  DB 2;  Length 2612;
  Best Local Similarity   97.5%;  Pred. No. 0;
  Matches 1151;  Conservative    0;  Mismatches   30;  Indels    0;  Gaps    0;



Qy          1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1035 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 1094

Qy         61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1095 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 1154

Qy        121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
              |||||||||||||||||||||||||||||||||||||| || ||||||||||||||||||
Db       1155 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTAGGGTTCACCCCACCGCAC 1214

Qy        181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
              ||||||||||||||||||||||||||||||||||||||||  ||||||||||||||||||
Db       1215 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAATGCAAACCTTGCCAGCAAAT 1274

Qy        241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
              ||||||||||| || |||||||||||||||||| ||||||||||||||||||||||||||
Db       1275 CCGCCTCCTGCCTCTACCAATCGCCAGTCAGGACGGCAGCCTACCCCGCTGTCTCCACCT 1334

Qy        301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
               |||||| |||||||||||| |||||||| |||||||||||||| |||||||||||||||
Db       1335 CTGAGAACCACTCATCCTCAGGCCATGCACTGGAACTCCACAACCTTCCACCAAACTCTG 1394

Qy        361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
              |||||||||||||||||||| |||||||||||||||||||||||||||||||| ||||||
Db       1395 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGGACAGTA 1454

Qy        421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
              ||||||||||||||||||  ||||||||||||||||||||||||||||||||||||||||
Db       1455 AACCCTGTTCCGACTACTACCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 1514

Qy        481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
              |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1515 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 1574

Qy        541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1575 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 1634

Qy        601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db       1635 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAATC 1694

Qy        661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1695 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 1754

Qy        721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1755 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 1814

Qy        781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
              ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db       1815 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCATCAACCACCAGC 1874

Qy        841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
              |||||| | |||||| |||||||||||||||||||||||| |||||||||||||||||||
Db       1875 ACGGGATCCTGCAGAACCTGCACGACTCCTGCTCAAGGAATCTCTATGTATCCCTCCTGT 1934

Qy        901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1935 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 1994

Qy        961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
              ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Db       1995 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCTTGGCTCAGTTTACTAGTG 2054

Qy       1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
              |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db       2055 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCATTGTTTGGCTTTCAGTTATATGGATG 2114

Qy       1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
              |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db       2115 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 2174

Qy       1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181
              ||||||||||||||||||| |||||||||||||||| ||||
Db       2175 ATTTTCTTTTGTCTTTGGGCATACATTTAAACCCTAACAAA 2215
RESULT 4
A08967
LOCUS       A08967                  2743 bp    DNA     linear   PAT 01-SEP-1993
DEFINITION  Hepatitis B virus genes for core antigen and surface antigen.
ACCESSION   A08967
VERSION     A08967.1  GI:411872
KEYWORDS    .
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1  (bases 1 to 2743)
  AUTHORS   Murray,K. and Schaller,H.E.
  TITLE     Recombinant DNA molecules and their method of production
  JOURNAL   Patent: EP 0374869-A 5 27-JUN-1990;
            Biogen, Inc.; BIOGEN, INC
FEATURES             Location/Qualifiers
     source          1..2743
                     /organism="Hepatitis B virus"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:10407"
     CDS             1..84
                     /note="Protein sequence is in conflict with the conceptual
                     translation"
                     /codon_start=1
                     /product="HBcAg"
                     /protein_id="CAA00815.1"
                     /db_xref="GI:1334788"
                     /translation="GGLFHLCLIISCSCPTVQASKLCLGWL"
     CDS             88..639
                     /codon_start=1
                     /product="core antigen"
                     /protein_id="CAA00816.1"
                     /db_xref="GI:411874"
                     /db_xref="GOA:P03147"
                     /db_xref="UniProtKB/Swiss-Prot:P03147"
                     /translation="MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTAAALYRDALES
                     PEHCSPHHTALRQAILCWGDLMTLATWVGTNLEDPASRDLVVSYVNTNVGLKFRQLLW
                     FHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRGRSPRRRT
                     PSPRRRRSQSPRRRRSQSRESQC"
     CDS             1524..2204
                     /codon_start=1
                     /product="surface antigen"
                     /protein_id="CAA00817.1"
                     /db_xref="GI:411875"
                     /translation="MENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGG
                     TTVCLGQNSQSPISNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQG
                     MLPVCPLIPGSSTTSTGSCRTCTTPAQGISMYPSCCCTKPSDGNCTCIPIPSSWAFGK
                     FLWEWASARFSWLSLLVPFVQWFVGLSPIVWLSVIWMMWYWGPSLYSILSPFLPLLPI
                     FFCLWAYI"
ORIGIN

  Query Match             95.9%;  Score 1133;  DB 2;  Length 2743;
  Best Local Similarity   97.5%;  Pred. No. 0;
  Matches 1151;  Conservative    0;  Mismatches   30;  Indels    0;  Gaps    0;



Qy          1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1035 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 1094

Qy         61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1095 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 1154

Qy        121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
              |||||||||||||||||||||||||||||||||||||| || ||||||||||||||||||
Db       1155 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTAGGGTTCACCCCACCGCAC 1214

Qy        181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
              ||||||||||||||||||||||||||||||||||||||||  ||||||||||||||||||
Db       1215 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAATGCAAACCTTGCCAGCAAAT 1274

Qy        241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
              ||||||||||| || |||||||||||||||||| ||||||||||||||||||||||||||
Db       1275 CCGCCTCCTGCCTCTACCAATCGCCAGTCAGGACGGCAGCCTACCCCGCTGTCTCCACCT 1334

Qy        301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
               |||||| |||||||||||| |||||||| |||||||||||||| |||||||||||||||
Db       1335 CTGAGAACCACTCATCCTCAGGCCATGCACTGGAACTCCACAACCTTCCACCAAACTCTG 1394

Qy        361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
              |||||||||||||||||||| |||||||||||||||||||||||||||||||| ||||||
Db       1395 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGGACAGTA 1454

Qy        421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
              ||||||||||||||||||  ||||||||||||||||||||||||||||||||||||||||
Db       1455 AACCCTGTTCCGACTACTACCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 1514

Qy        481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
              |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1515 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 1574

Qy        541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1575 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 1634

Qy        601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db       1635 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAATC 1694

Qy        661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1695 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 1754

Qy        721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1755 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 1814

Qy        781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
              ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db       1815 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCATCAACCACCAGC 1874

Qy        841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
              |||||| | |||||| |||||||||||||||||||||||| |||||||||||||||||||
Db       1875 ACGGGATCCTGCAGAACCTGCACGACTCCTGCTCAAGGAATCTCTATGTATCCCTCCTGT 1934

Qy        901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1935 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 1994

Qy        961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
              ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Db       1995 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCTTGGCTCAGTTTACTAGTG 2054

Qy       1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
              |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db       2055 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCATTGTTTGGCTTTCAGTTATATGGATG 2114

Qy       1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
              |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db       2115 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 2174

Qy       1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181
              ||||||||||||||||||| |||||||||||||||| ||||
Db       2175 ATTTTCTTTTGTCTTTGGGCATACATTTAAACCCTAACAAA 2215



RESULT 5 

HPBADYW
LOCUS       HPBADYW 2743 bp DNA linear VRL 05-DEC-2005
DEFINITION  human hepatitis b virus subtype adyw antigen genes (core antigen
            and surface antigen).
ACCESSION   J02202
VERSION     J02202.1 GI:329637
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1 (bases 1 to 2743)
  AUTHORS   Pasek,M., Goto,T., Gilbert,W., Zink,B., Schaller,H., MacKay,P.,
            Leadbetter,G. and Murray,K.
  TITLE     Hepatitis B virus genes and their expression in E. coli
  JOURNAL   Nature 282 (5739), 575-579 (1979)
   PUBMED   399329
COMMENT     Original source text: hbv subtype adyw from human.
            cf hbvayw and whvsag. hbcag is hepatitis b core antigen protein,
            and hbsag is surface antigen protein.
FEATURES             Location/Qualifiers
     source          1..2743
                     /organism="Hepatitis B virus"
                     /mol_type="genomic DNA"
                     /specific_host="Homo sapiens"
                     /db_xref="taxon:10407"
                     /note="subtype: adyw"
     CDS             88..639
                     /codon_start=1
                     /product="core antigen"
                     /protein_id="AAA45486.1"
                     /db_xref="GI:329638"
                     /translation="MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTAAALYRDALES
                     PEHCSPHHTALRQAILCWGDLMTLATWVGTNLEDPASRDLWSYVNTNVGLKFRQLLW
                     FHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTWRRRGRSPRRRT
                     PSPRRRRSQSPRRRRSQSRESQC"
     CDS             1524..2204
                     /codon_start=1
                     /product="surface antigen"
                     /protein_id="AAA45487.1"
                     /db_xref="GI:329639"
                     /translation="MENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGG
                     TTVCLGQNSQSPISNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQG
                     MLPVCPLIPGSSTTSTGSCRTCTTPAQGISMYPSCCCTKPSDGNCTCIPIPSSWAFGK
                     FLWEWASARFSWLSLLVPFVQWFVGLSPIVWLSVIWMMWYWGPSLYSILSPFLPLLPI
                     FFCLWAYI"
ORIGIN



Query Match 95.9%; Score 1133; DB 10; Length 2743;
Best Local Similarity 97.5%; Pred. No. 0;
Matches 1151; Conservative 0; Mismatches 30; Indels 0; Gaps 0;



Qy      1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1035 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 1094

Qy     61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1095 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 1154

Qy    121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
          |||||||||||||||||||||||||||||||||||||| || ||||||||||||||||||
Db   1155 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTAGGGTTCACCCCACCGCAC 1214

Qy    181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
          ||||||||||||||||||||||||||||||||||||||||  ||||||||||||||||||
Db   1215 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAATGCAAACCTTGCCAGCAAAT 1274

Qy    241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
          ||||||||||| || |||||||||||||||||| ||||||||||||||||||||||||||
Db   1275 CCGCCTCCTGCCTCTACCAATCGCCAGTCAGGACGGCAGCCTACCCCGCTGTCTCCACCT 1334

Qy    301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
           |||||| |||||||||||| |||||||| |||||||||||||| |||||||||||||||
Db   1335 CTGAGAACCACTCATCCTCAGGCCATGCACTGGAACTCCACAACCTTCCACCAAACTCTG 1394

Qy    361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
          |||||||||||||||||||| |||||||||||||||||||||||||||||||| ||||||
Db   1395 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGGACAGTA 1454

Qy    421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
          ||||||||||||||||||  ||||||||||||||||||||||||||||||||||||||||
Db   1455 AACCCTGTTCCGACTACTACCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 1514

Qy    481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
          |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1515 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 1574

Qy    541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1575 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 1634

Qy    601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db   1635 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAATC 1694

Qy    661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1695 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 1754

Qy    721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1755 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 1814

Qy    781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
          ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db   1815 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCATCAACCACCAGC 1874

Qy    841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
          |||||| | |||||| |||||||||||||||||||||||| |||||||||||||||||||
Db   1875 ACGGGATCCTGCAGAACCTGCACGACTCCTGCTCAAGGAATCTCTATGTATCCCTCCTGT 1934

Qy    901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1935 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 1994

Qy    961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
          ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Db   1995 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCTTGGCTCAGTTTACTAGTG 2054

Qy   1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
          |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db   2055 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCATTGTTTGGCTTTCAGTTATATGGATG 2114

Qy   1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
          |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db   2115 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 2174

Qy   1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181
          ||||||||||||||||||| ||||||||||||||||| |||
Db   2175 ATTTTCTTTTGTCTTTGGGCATACATTTAAACCCTAACAAA 2215



RESULT 6 
AB104723 
LOCUS       AB104723 1170 bp DNA linear VRL 26-JUN-2003
DEFINITION  Hepatitis B virus s gene for pre-S and S protein, complete cds,
            isolate: EG91.
ACCESSION   AB104723
VERSION     AB104723.1 GI:32261206
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1
  AUTHORS   Saudy,N., Sugauchi,F., Tanaka,Y., Suzuki,S., Aal,A.A., Zaid,M.A.,
            Agha,S. and Mizokami,M.
  TITLE     Genotypes and phylogenetic characterization of hepatitis B and
            delta viruses in Egypt
  JOURNAL   J. Med. Virol. 70 (4), 529-536 (2003)
   PUBMED   12794714
REFERENCE   2 (bases 1 to 1170)
  AUTHORS   Suzuki,S., Saudy,N., Sugauchi,F., Orito,E., Agha,S. and Mizokami,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-FEB-2003) Seiji Suzuki, Nagoya City University
            Graduate School, Department of clinical Molecular Informative
            Medicine; Mizuho, Nagoya, Aichi 467-8601, Japan
            (E-mail:seijis@med.nagoya-cu.ac.jp, Tel:81-52-853-8292,
            Fax:81-52-842-0021)
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /mol_type="genomic DNA"
                     /isolate="EG91"
                     /db_xref="taxon:10407"
                     /country="Egypt"
                     /note="genotype D"
     gene            1..1170
                     /gene="s"
     CDS             1..1170
                     /gene="s"
                     /codon_start=1
                     /product="pre-S and S protein"
                     /protein_id="BAC78518.1"
                     /db_xref="GI:32261207"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVSPVPTTVSHISSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCTTPAQGTSMYPSCCCTKPLDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
ORIGIN



Query Match 95.8%; Score 1131.6; DB 10; Length 1170; 

Best Local Similarity 97.9%; Pred. No. 0; 

Matches 1146; Conservative 0; Mismatches 24; Indels 0; Gaps 0; 



Qy      1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy     61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
          |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db     61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120

Qy    121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
          |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db    121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGATTCACCCCACCGCAC 180

Qy    181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
          |||||||||||||||||||||||||||||||||||||||  |||||||||||||||||||
Db    181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACCTTGCCAGCAAAT 240

Qy    241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
          |||||||||||||| |||||||||||||||||||||||||||||||| ||||||||||||
Db    241 CCGCCTCCTGCTTCTACCAATCGCCAGTCAGGAAGGCAGCCTACCCCTCTGTCTCCACCT 300

Qy    301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
           ||||||||||||||||||| ||||||||||||||||||||||| |||||||||||||||
Db    301 CTGAGAAACACTCATCCTCAGGCCATGCAGTGGAACTCCACAACCTTCCACCAAACTCTG 360

Qy    361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    361 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy    421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
          | ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db    421 AGCCCTGTTCCGACTACTGTCTCTCACATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy    481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
          |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540

Qy    541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600

Qy    601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy    661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy    721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy    781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840

Qy    841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
          ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db    841 ACGGGACCATGCAGAACCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy    901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
          |||||||| ||||||| ||| ||||| |||||||||||||||||||||||||||||||||
Db    901 TGCTGTACCAAACCTTTGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy    961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy   1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy   1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
          |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db   1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy   1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
          ||||||||||||||||||||||||||||||
Db   1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170



RESULT 7 
AB104715 
LOCUS       AB104715 1170 bp DNA linear VRL 26-JUN-2003
DEFINITION  Hepatitis B virus s gene for pre-S and S protein, complete cds,
            isolate: EG69.
ACCESSION   AB104715
VERSION     AB104715.1 GI:32261190
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1
  AUTHORS   Saudy,N., Sugauchi,F., Tanaka,Y., Suzuki,S., Aal,A.A., Zaid,M.A.,
            Agha,S. and Mizokami,M.
  TITLE     Genotypes and phylogenetic characterization of hepatitis B and
            delta viruses in Egypt
  JOURNAL   J. Med. Virol. 70 (4), 529-536 (2003)
   PUBMED   12794714
REFERENCE   2 (bases 1 to 1170)
  AUTHORS   Suzuki,S., Saudy,N., Sugauchi,F., Orito,E., Agha,S. and Mizokami,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-FEB-2003) Seiji Suzuki, Nagoya City University
            Graduate School, Department of clinical Molecular Informative
            Medicine; Mizuho, Nagoya, Aichi 467-8601, Japan
            (E-mail:seijis@med.nagoya-cu.ac.jp, Tel:81-52-853-8292,
            Fax:81-52-842-0021)
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /mol_type="genomic DNA"
                     /isolate="EG69"
                     /db_xref="taxon:10407"
                     /country="Egypt"
                     /note="genotype D"
     gene            1..1170
                     /gene="s"
     CDS             1..1170
                     /gene="s"
                     /codon_start=1
                     /product="pre-S and S protein"
                     /protein_id="BAC78510.1"
                     /db_xref="GI:32261191"
                     /translation="MGQNLSTSNPLGFFPDHQLLPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSPISSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCTTPAQGTSMYPSCCCTKPLDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
ORIGIN



Query Match 95.5%; Score 1128.4; DB 10; Length 1170; 

Best Local Similarity 97.8%; Pred. No. 0; 

Matches 1144; Conservative 0; Mismatches 26; Indels 0; Gaps 0; 



Qy      1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGTTA 60

Qy     61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
          |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db     61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120

Qy    121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
          |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db    121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGATTCACCCCACCGCAC 180

Qy    181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
          |||||||||||||||||||||||||||||||||||||||  |||||||||||||||||||
Db    181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACCTTGCCAGCAAAT 240

Qy    241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
          |||||||||||||| |||||||||||||||||||||||||||||||| ||||||||||||
Db    241 CCGCCTCCTGCTTCTACCAATCGCCAGTCAGGAAGGCAGCCTACCCCTCTGTCTCCACCT 300

Qy    301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
           ||||||||||||||||||| ||||||||||||||||||||||| |||||||||||||||
Db    301 CTGAGAAACACTCATCCTCAGGCCATGCAGTGGAACTCCACAACCTTCCACCAAACTCTG 360

Qy    361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    361 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy    421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy    481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
          |||  |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCGTTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540

Qy    541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600

Qy    601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy    661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy    721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy    781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840

Qy    841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
          ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db    841 ACGGGACCATGCAGAACCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy    901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
          |||||||| ||||||| ||| ||||| |||||||||||||||||||||||||||||||||
Db    901 TGCTGTACCAAACCTTTGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy    961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy   1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy   1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
          |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db   1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy   1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
          ||||||||||||||||||||||||||||||
Db   1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170



RESULT 8 
BD181818 
LOCUS       BD181818 8007 bp DNA linear PAT 15-MAY-2003
DEFINITION  Hepatitis B virus vectors for gene therapy.
ACCESSION   BD181818
VERSION     BD181818.1 GI:30792736
KEYWORDS    JP 2002320480-A/3.
SOURCE      unidentified
  ORGANISM  unidentified
            unclassified sequences.
REFERENCE   1 (bases 1 to 8007)
  AUTHORS   Ryu,W., Jeong,J.K., Lee,J., Cho,W.Y. and Yoon,G.S.
  TITLE     Hepatitis B virus vectors for gene therapy
  JOURNAL   Patent: JP 2002320480-A 3 05-NOV-2002;
            WANG-SHICK RYU
COMMENT     OS   Unidentified
            PN   JP 2002320480-A/3
            PD   05-NOV-2002
            PF   20-APR-2001 JP 2001122392
            PR   20-APR-2000 KR 2000-21070, 12-APR-2001 KR 2001-19645
            PI   WANG-SHICK RYU, JONG KEUN JEONG, JEHAN LEE, WOO YOUNG CHO, GYE
            PI   SOON YOON
            PC   C12N15/09, A61K35/74, A61K48/00, A61P31/20, C12N7/00, C12N15/00
            CC   pCMV-HBV/30 Full Sequences
            CC   8007 bp ms-DNA circular
            CC   From HBV-ayw
            CC   #1 ; #1820 of HBV-ayw (accession number J02203)
            CC   transcription start site of HBV pregenomic RNA
            CC   #1 - 3360 ; HBV-ayw (178 bp overlapping the HBV genome-length)
            CC   5'- epsilon secondary structure ; bases 30 - 90
            CC   3'- epsilon secondary structure ; bases 3212 - 3272
            CC   5'- DR1 ; bases 7-17
            CC   3'- DR1 ; bases 3189 - 3199
            CC   3'- DR2 ; bases 2955 - 2965
            CC   Poly A signal ; bases 3281-3286
            CC   Core ORF ; bases 84 - 632 (exclude stop codon)
            CC   Polymerase ORF ; bases 490 - 2985 (exclude stop codon)
            CC   S1 ORF ; bases 1031 - 2197 (exclude stop codon)
            CC   S2 ORF ; bases 1355 - 2197 (exclude stop codon)
            CC   S ORF ; bases 1520 - 2197 (exclude stop codon)
            CC   X ORF ; bases 2739 - 3200 (exclude stop codon)
            CC   From pcDNA1/Amp
            CC   Col E1 origin ; bases 5103-5689 (1-587 of pcDNA1/Amp)
            CC   M13 origin ; bases 5690-6282
            CC   Ampicillin gene ; bases 6462-7405
            CC   CMV promoter ; bases 7406-7999
            CC   SP6 primer sequence ; bases 3372-3390
            CC   Splice and polyA ; bases 3391-4089
            FH   Key Location/Qualifiers
            FT   source 1..8007
            FT   /organism='Unidentified'.
FEATURES             Location/Qualifiers
     source          1..8007
                     /organism="unidentified"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:32644"
ORIGIN



Query Match 95.5%; Score 1128.2; DB 2; Length 8007;
Best Local Similarity 97.2%; Pred. No. 0;
Matches 1148; Conservative 0; Mismatches 33; Indels 0; Gaps 0;



Qy      1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1031 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 1090

Qy     61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
          |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db   1091 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 1150

Qy    121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
          |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db   1151 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGTTTCACCCCACCGCAC 1210

Qy    181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
          |||||||||||||||||||||||||||||||||||||||  |||||| ||||||||||||
Db   1211 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACTTTGCCAGCAAAT 1270

Qy    241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
          ||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
Db   1271 CCGCCTCCTGCCTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 1330

Qy    301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
          |||||||||||||||||||| |||||||||||||| |||||||| |||||||||||||||
Db   1331 TTGAGAAACACTCATCCTCAGGCCATGCAGTGGAATTCCACAACCTTCCACCAAACTCTG 1390

Qy    361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   1391 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 1450

Qy    421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
          |||||||||| |||||||| ||||||| ||||||||||||||||||||||||||||||||
Db   1451 AACCCTGTTCTGACTACTGCCTCTCCCTTATCGTCAATCTTCTCGAGGATTGGGGACCCT 1510

Qy    481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
          |||| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db   1511 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTTCTCGTGTTACAGGCG 1570

Qy    541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1571 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 1630

Qy    601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1631 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 1690

Qy    661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   1691 TCCAATCACTCACCAACCTCTTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 1750

Qy    721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1751 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 1810

Qy    781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
          ||||||||||||||||||||||||||||||||||||||||||||||| ||||| ||||||
Db   1811 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCCTCAACAACCAGC 1870

Qy    841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
          |||||||||||| |  |||||| |||| ||||||||||||||||||||||||||||||||
Db   1871 ACGGGACCATGCCGGACCTGCATGACTACTGCTCAAGGAACCTCTATGTATCCCTCCTGT 1930

Qy    901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
          |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db   1931 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 1990

Qy    961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1991 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 2050

Qy   1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2051 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 2110

Qy   1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
          |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db   2111 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 2170

Qy   1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181
          ||||||||||||||||||||||||||||||||||||| |||
Db   2171 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAACAAA 2211



RESULT 9 
AY603461 
LOCUS       AY603461 1170 bp DNA linear VRL 19-APR-2005
DEFINITION  Hepatitis B virus isolate 04T large S protein (S), middle S protein
            (S), and S protein (S) genes, complete cds.
ACCESSION   AY603461
VERSION     AY603461.1 GI:47499935
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1 (bases 1 to 1170)
  AUTHORS   Sominskaya,I., Mihailova,M., Jansons,J., Emelyanova,V.,
            Folkmane,I., Smagris,E., Dumpis,U., Rozentals,R. and Pumpens,P.
  TITLE     Hepatitis B and C virus variants in long-term immunosuppressed
            renal transplant patients in Latvia
  JOURNAL   Intervirology 48 (2-3), 192-200 (2005)
   PUBMED   15812194
REFERENCE   2 (bases 1 to 1170)
  AUTHORS   Sominskaya,I., Mihailova,M., Jansons,J., Emelyanova,V.,
            Folkmane,I., Smagris,E., Dumpis,U., Rozentals,R. and Pumpens,P.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-2004) Protein Engineering Department, Biomedical
            Research and Study Centre, University of Latvia, Ratsupites str. 1,
            Riga LV-1067, Latvia
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /virion
                     /mol_type="genomic DNA"
                     /isolate="04T"
                     /isolation_source="renal transplantation patient"
                     /db_xref="taxon:10407"
                     /country="Latvia"
                     /note="subtype: ayw3; genotype: D"
     gene            1..1170
                     /gene="S"
     CDS             1..1170
                     /gene="S"
                     /note="pre-S1/pre-S2/S; LHBS/MHBS/HBsAg; surface antigen"
                     /codon_start=1
                     /product="large S protein"
                     /protein_id="AAT28720.1"
                     /db_xref="GI:47499936"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGIIQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSPISSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSAGPCRTCTTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
     CDS             325..1170
                     /gene="S"
                     /note="pre-S2/S; MHBS/HBsAg; surface antigen"
                     /codon_start=1
                     /product="middle S protein"
                     /protein_id="AAT28721.1"
                     /db_xref="GI:47499937"
                     /translation="MQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSPISS
                     IFSRIGDPALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTV
                     CLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLP
                     VCPLIPGSSTTSAGPCRTCTTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLW
                     EWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFC
                     LWVYI"
     CDS             490..1170
                     /gene="S"
                     /note="HBsAg; surface antigen"
                     /codon_start=1
                     /product="S protein"
                     /protein_id="AAT28722.1"
                     /db_xref="GI:47499938"
                     /translation="MENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGG
                     TTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQG
                     MLPVCPLIPGSSTTSAGPCRTCTTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGK
                     FLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPI
                     FFCLWVYI"
ORIGIN



Query Match 95.4%; Score 1126.8; DB 10; 

Best Local Similarity 97.7%; Pred. No. 0; 
Matches 1143; Conservative 0; Mismatches 27; 



Length 1170; 
Indels 0; Gaps 



0; 
Qy


i 


ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 


60 






Mill 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 ! 1 1 




nh 


1 

X 


atppj^ ap ap a atptttpp a ppapp a a tpptptpppattptttpppp. a pp a ppapttpp at 


£ n 
0 u 


Qy 


61 


CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 


120 






lllllllll Nil IIIMIII IIIIIIIIIIMMII MM llllllllll IMIII 






D J. 


PPAPPPTTP AP APP A A APAPPPPA A fiTPPIiP A TTPPP A PTTP A ATPPPA AP A APPAPAPP 


1 9 rt 


Qy 


121 


TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 


180 






1 1 E 1 1 1 1 1 1 1 1 1 1 i 1 1 1 E 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 MIM 1 1 1 1 1 1 1 II II 1 II 1 




nh 




TnP.PPAPA.PP.PPA. APA APPTAPPAPPTPPAPP ATTPPPPPTPPP ATTPAPPPPAPPPP AP 


1 ft Pi 


Qy 


181 


GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 


240 






1 II 1 1 M II 1 1 1 1 1 1 1 1 1 II M 1 II 1 1 1 II 1 1 II II 1 1 1 MINI 1 1 1 1 1 1 1 1 1 1 1 1 




nh 


1 ftl 


PP A PPPPTTTTPPPPTPP A PPPPTP A PPPTP A PPPPA TP AT APA A APTTTPPPAPPA A AT 


9/i n 


Qy 


241 


CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 


300 






MMMMIM 1 1 M 1 1 1 1 1 M 1 1 1 1 1 1 1 M 1 1 1 1 1 II 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 1 






941 


PPPPPTPPTPP ATPP APP A ATPPPPAPTPAPP A APPPAPPPT APPPPPPTPTPTPP APPT 
V-V_VjV_V_ 1 \-V- 1 r\ 1 ^LALLAAl L.o\-L.Avj1 V„AkjuAAvjVjV - Auv_L. 1 AV-V-U-I-IjL. lulLl L.L-AL.L. 1 


1 rt r\ 


Qy 


301 


TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 


360 






Mill! Ml II II lllllll 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 MMMMMIMM 




nh 




TTPAPA A AP APTP ATPPTP APPPPATPP APTPPA APTPPAPA APPTTPPAPP A A APTPTP 




Qy 


361 


CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 


420 






MMMMIM MMMM M 1 M M M 1 M II M M M M M M 1 1 M 1 M 1 M M 1 




Db 


J O X 


PA APATPPPAPPPTPAPAPPPPTPTATTTPPPTPPTPPTPPPTPPAPTTP APP A AP APT A 




Qy 


421 


AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 


480 






lllllllllllll MMMIMMMMIIMMIMMMMMIIMIMIIIIMM 




nh 


4 91 
*± X 


A APPPTPTTPPP A PT A PTPTPTPTPPP A T A TPf'TP A A TPTTPTPPA C*r* JVTTPPPPA /-i/-i/-trp 
nhLLLlol ILtuALlAtlVJlLlLl^LtAlAlLulUAAltl 1 L. H-VjAvjbjAl 1VjoVjVjAL,V_L 1 


A O rt 

4b0 


Qy 


481 


GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 


540 






MM IIIMIII MMMIMMMIMI IMIIMIMIMIMIMMMMIMM 




Db 


4 ft 1 
*± o X 


PPPPTPA AP ATPPAP A APATPAPATPAPPATTPPTAPP IPPPPTPPTPPTPTTA PA PPPf 
1 \jHJ\\^i\ 1 ovjA^jAAV-A 1 ^- AL.A 1 v_AoLjA 11LL1 AulaAl-CCt- 1 (jL ILblvjl lALAboLo 


c a r\ 


Qy 


541 


GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 


600 






IMIIIIMMMIMMMMMMMMMMIIIIMIMMMIMMMIMI II 




Db 


^41 


PPPTTTTTPTTPTTPAP A APA ATPPTP APA AT APPPPAPAPTPT APA PT , PP r PPP p Pr , P A PT 1 
ouu 1 lul J. uAv-AAuAAl tL 1 LALAA1 ALLuLAbAb 1 L 1 AuAL 1 Lbi bb 1 obAL 1 


£ rt rt 

600 


Qy 


601 


TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 


660 






MMMIMMMMMMIMIMMMIIMMMMMIMMMMIMIMM II 




Db 


QUI 


TPTPTPA ATTTTPT A PPPPP A SPTRPPPTPTPTPTTPPPPA AAA TTPPP APT" PPPfi A Tinp 
1 V_ H_ 1 11 J.L 1 AuVjuVj^AAb. lALLb 1 \j lu 1 ^ 1 IbbLLAAAAl 1 UuL-Atji Cv-LLAACL. 


r e r\ 

660 


Qy 


661 


TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 


720 






Ml MIMIIMI Ml IMMMMIIIMMII IMIII 1 Ml MUM MM MM II 




Db 


O O X 


TPP A ATP APTP APP A A PPTPPTPTPPTPPA A PTTPTPPTPPTT J TPPPTPP A TPTPTpfpp 
l LAL 1 V^ALLAALL 1 LL 1 u 1 bt 1 LUAAL 1 IblLLlbbl lAlbbL IbbAlblblL lb 


t 0 rt 
0 


Qy 


721 


CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 


780 






MMMMMMMMMMMMMMMIMMMMMMMMMMMMMMI 




Dh 


791 


Pnr.PP.TTTTSTPRTPT'PPPTPTTPRTPP'PPPTPP'PATPPPTP A TPTTPTTPTTPPTTPTT 

1 1 1 1A1LA1L1 ItblLl ILAlbblbLlbb lAlbbt lL.Alb.1 1 L. 1 Ibbl iCl 1 


*7 O rt 

/oO 


Qy 


781 


CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 


840 






III lllllllllllllllllllllll III lllllll MM Ml lllllll III IMIII 




nh 


*7 ft 1 
/OX 


PTPP APTATPA A PPT A TPTTPPPPPTPTPTPPTPT A ATTPPH PHTV rp/^irprp /-\ 71 tv /-1 /~i t\ a 

>- HjVjAL. 1 Al L-AAVjVjI Albl HjUUv^vjI L ibil v-L. 1 L. 1 AA 1 l L L AGbrA I L I 1 CAACCACCAGC 


840 


Qy  841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
         |||||||||||||| ||||||||||| ||||||||||||||||||||||||||||||||
Db  841 GCGGGACCATGCAGAACCTGCACGACTACTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy  901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
        |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db  901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CCCTTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db 1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
        |||||||| |||||||||||||||||||||
Db 1141 ATTTTCTTCTGTCTTTGGGTATACATTTAA 1170



RESULT 10
DQ219811
LOCUS       DQ219811                3207 bp    DNA     linear   SYN 23-OCT-2005
DEFINITION  Synthetic construct isolate L1.1 Hepatitis B virus precore
            (precore), polymerase-reverse transcriptase (Pol), PreS1 (PreS1),
            and X protein (X) genes, complete cds.
ACCESSION   DQ219811
VERSION     DQ219811.1  GI:77819763
KEYWORDS    .
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            other sequences; artificial sequences.
REFERENCE   1  (bases 1 to 3207)
  AUTHORS   Thakur,V., Kazim,S.N., Guptan,R.C., Malhotra,V. and Sarin,S.K.
  TITLE     Molecular epidemiology and transmission of hepatitis B virus in
            close family contacts of HBV-related chronic liver disease patients
  JOURNAL   J. Med. Virol. 70 (4), 520-528 (2003)
   PUBMED   12794713
REFERENCE   2  (bases 1 to 3207)
  AUTHORS   Chakraborty,A.K., Chauhan,R. and Sarin,S.K.
  TITLE     A consensus method for numbering and representing Hepatitis B Virus
            genome useful for recombinant vector design
  JOURNAL   Unpublished
REFERENCE   3  (bases 1 to 3207)
  AUTHORS   Chakraborty,A.K., Chauhan,R. and Sarin,S.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-SEP-2005) Gastroenterology, G. B. Pant Hospital,
            ICMR-Advanced Center for Liver Diseases, New Delhi, Delhi 110002,
            India
FEATURES             Location/Qualifiers
     source          1..3207
                     /organism="synthetic construct"
                     /mol_type="other DNA"
                     /serotype="ayw"
                     /isolate="L1.1"
                     /db_xref="taxon:32630"
                     /country="India"
                     /note="derived from Hepatitis B virus isolated from
                     patient with chronic HBV infection
                     genotype: D"
     gene            1..639
                     /gene="precore"
     CDS             1..639
                     /gene="precore"
                     /note="contains core protein and eAg"
                     /codon_start=1
                     /transl_table=11
                     /product="precore"
                     /protein_id="ABB04014.1"
                     /db_xref="GI:77819764"
                     /translation="MQLFHLCLIISCSCPTVQASKLCLGWLWGMDIDPYKEFGATVEL
                     LSFLPSDFFPSVRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMTLATWVG
                     VNLEDPASRDLVVSYVNTNMGLKFRQLLWFHISCLTFGRETVIEYLVSFGVWIRTPPA
                     YRPPNAPILSTLPETTVVRRRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC"
     gene            494..2992
                     /gene="Pol"
     CDS             494..2992
                     /gene="Pol"
                     /note="multifunctional protein having both DNA polymerase
                     and reverase transcriptase activity"
                     /codon_start=1
                     /transl_table=11
                     /product="polymerase-reverse transcriptase"
                     /protein_id="ABB04015.1"
                     /db_xref="GI:77819765"
                     /translation="MPLSYQHFRRLLLLDDEAGPLEEELPRLADEGLNRRVAEDLNLG
                     NLNVSIPWTHKVGNFTGLYSSTVPVFNPHWKTPSFPNIHLHQDIIKKCEQFVGPLTVN
                     EKRRLQLIMPARFYPKVTKYLPLDKGIKPYYPEHLVNHYFQTRHYLHTLWKAGILYKR
                     ETTHSASFCGSPYSWEQDLLHGAESFHQQSSGILSRPPVGSSLQSKHRKSRLGLQSQQ
                     GHLARRQQGRSWSIRAGFHPTARRPFGVEPSGSGHTTNFASKSASCLHQSPVRKAAYP
                     AVSTFEKHSSSGHAVELHNLPPNSARSQSERPVFPCWWLQFRNSKPCSDYCLSLIVNL
                     LEDWGPCAEHGEHHIRIPRTPSRVTGGVFLVDKNPHNTAESRLVVDFSQFSRGNYRVS
                     WPKFAVPNLQSLTNLLSSNLSWLSVDVSAAFHHLPLHPAAMPHLLVGSSGLSRYVARL
                     SSNSRILNNQHGTMPDLHDYCSRNLYVSLLLLYQTFGRKLHLYSHPIILGFRKIPMGV
                     GLSPFLLAQFTSAICSVVRRAFPHCLAFSYMDDVVLGAKSVQHLESLFTAVTNFLLSL
                     GIHLNPNKTKRWGYSLNFMGYVIGCYGSLPQEHIIQKIKECFRKLPINRPIDWKVCQR
                     IVGLLGFAAPFTQCGYPALMPLYACIQSKQAFTFSPTYKAFLCKQYLNLYPVARQRPG
                     LCQVFADATPTGWGLVMGHQRMRGTFSAPLPIHTAELLAACFARSRSGANIIGTDNSV
                     VLSRKYTSFPWLLGCAANWILRGTSFVTVPSALNPADDPSRGRLGLSRPLLRLPFRPT
                     TGRTSLYADSPSVPSHLPDRVHFASPLHVAWRPP"
     gene            1035..2204
                     /gene="PreS1"
     CDS             1035..2204
                     /gene="PreS1"
                     /note="surface protein; antigenic protein; virus entry and
                     cell surface recognition"
                     /codon_start=1
                     /transl_table=11
                     /product="PreS1"
                     /protein_id="ABB04016.1"
                     /db_xref="GI:77819766"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVLTTASPLSSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
     gene            2743..3207
                     /gene="X"
     CDS             2743..3207
                     /gene="X"
                     /note="transactivator protein; role in viral pathogenesis
                     including hepatocellular carcinoma; transactivates many
                     cellular genes"
                     /codon_start=1
                     /transl_table=11
                     /product="X protein"
                     /protein_id="ABB04017.1"
                     /db_xref="GI:77819767"
                     /translation="MAARLCCQLDPARDVLCLRPVGAESCGRPFSGSLGTLSSPSPSA
                     VPTDHGAHLSLRGLPVCAFSSAGPCALRFTSARRMETTVNAHQILPKVLHKRTLGLSA
                     MSTTDLEAYFKDCLFKDWEELGEEIRLKVFVLGGCRHKLVCAPAPCNFFTSA"
     misc_feature    3183..3207
                     /gene="X"
                     /note="25 nt added to keep all reading frames in order;
                     useful for vector design and genome analysis"
ORIGIN

Query Match 95.4%; Score 1126.6; DB 8; Length 3207;

Best Local Similarity 97.1%; Pred. No. 0;

Matches 1147; Conservative 0; Mismatches 34; Indels 0; Gaps 0;



Qy 


i 


ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 


60 






I! 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : II 1 II II 1 1 1 : 1 1 1 II 1 1 1 1 1 1 II II 1 1 II II ! 1 1 1 II 




UD 




A 1GGGGC AGAA 1 C 1 1 1CCACCAGCAA1 CC1 CIGGGA1 ICI 1 1 CLL.oACLACL.Avj1 1UGA1 


iuy 4 


Qy 


61 


CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 


120 




lllllllllllllllllllll 1 1 1 II 1 II 1 1 M II II 1 1 1 1 II II 1 1 II 1 i .. 1 1 1 II 




UD 


1 AQC 


CLAGLLi lLAGAGLAAALALLGLAAAl LLAGA1 IbbbALl 1LAA1LLLAALAAGGALACL 




Qy 


121 


TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 


180 




Illllllllllll llllll IIIMIIIIIIII III III Mill Mlllllllllllll 




UD 


1155 


1GGCCAGACGCCAACAAGG1 AGGAGC IbbAblAI 1CGGGC IGGGI I ICACCCCACCGCAC 


1214 


Qy 


181 


GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 


240 




1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llllll MINIMUM 




DD 


1215 


GG AGG L L T TTT GGGGTGGAGC C C T C AGG C T C AGGG C AT AC T AC AAAC T T T G C C AGC AAAT 


1274 


Qy 


241 


CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 


300 




lllllllllll 1 1 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 M 1 1 1 II 1 II 1 M 1 1 1 II 1 1 1 1 M 1 1 1 1 




UD 


lz /b 


CCGCC 1 CC I GCC 1 CCACCAAI CGCCAG1 CAGGAAGGCAGCC I ACCCCGC 1 Gl C i CCACC 1 


1334 


Qy 


301 


TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 


360 






II 1 II 1 M M 1 1 1 II 1 1 1 1 1 II 1 II II Mill 1 1 1 II II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 II 1 




UD 


1335 


TTGAGAAACACTCATCCTCAGGCCATGCAGTC 


1394 


Qy 


361 


CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 


420 






1 1 II II 1 1 1 II 1 II 1 1 1 M III II II II II II 1 1 II II 1 II Ml M 1 II II M II 1 II 




UD 


1395 


CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAAC 


1454 


Qy 


421 


AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 


480 






lllllll II IIIIIMI MUM 1 II M 1 II II 1 1 1 1 1 1 1 II 1 1 1 II M II 1 1 




UD 


1455 


AACCCTGTTCTGACTACTGCCTCTCCCTTATCGTCAATCTTCTCGAGGATTGGGGA 


1514 


Qy  481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
        |||| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db 1515 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTTCTCGTGTTACAGGCG 1574

Qy  541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1575 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 1634

Qy  601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1635 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 1694

Qy  661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
        |||||||||||||||||||| |||||||||||||||||||||||||| ||||||||||||
Db 1695 TCCAATCACTCACCAACCTCTTGTCCTCCAACTTGTCCTGGTTATCGGTGGATGTGTCTG 1754

Qy  721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
        |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1755 CGGCGTTTCATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 1814

Qy  781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
        ||||||||||||||||||||||||||||||||||||||||||||||| ||||| ||||||
Db 1815 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCCTCAACAACCAGC 1874

Qy  841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
        |||||||||||| |  |||||| |||| ||||||||||||||||||||||||||||||||
Db 1875 ACGGGACCATGCCGGACCTGCATGACTACTGCTCAAGGAACCTCTATGTATCCCTCCTGT 1934

Qy  901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
        |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db 1935 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 1994

Qy  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1995 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 2054

Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2055 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 2114

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db 2115 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 2174

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181
        ||||||||||||||||||||||||||||||||||||| |||
Db 2175 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAACAAA 2215



RESULT 11
AB104722
LOCUS       AB104722                1170 bp    DNA     linear   VRL 26-JUN-2003
DEFINITION  Hepatitis B virus s gene for pre-S and S protein, complete cds,
            isolate: EG33.
ACCESSION   AB104722
VERSION     AB104722.1  GI:32261204
KEYWORDS    .
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1
  AUTHORS   Saudy,N., Sugauchi,F., Tanaka,Y., Suzuki,S., Aal,A.A., Zaid,M.A.,
            Agha,S. and Mizokami,M.
  TITLE     Genotypes and phylogenetic characterization of hepatitis B and
            delta viruses in Egypt
  JOURNAL   J. Med. Virol. 70 (4), 529-536 (2003)
   PUBMED   12794714
REFERENCE   2  (bases 1 to 1170)
  AUTHORS   Suzuki,S., Saudy,N., Sugauchi,F., Orito,E., Agha,S. and Mizokami,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-FEB-2003) Seiji Suzuki, Nagoya City University
            Graduate School, Department of clinical Molecular Informative
            Medicine; Mizuho, Nagoya, Aichi 467-8601, Japan
            (E-mail:seijis@med.nagoya-cu.ac.jp, Tel:81-52-853-8292,
            Fax:81-52-842-0021)
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /mol_type="genomic DNA"
                     /isolate="EG33"
                     /db_xref="taxon:10407"
                     /country="Egypt"
                     /note="genotype D"
     gene            1..1170
                     /gene="s"
     CDS             1..1170
                     /gene="s"
                     /codon_start=1
                     /product="pre-S and S protein"
                     /protein_id="BAC78517.1"
                     /db_xref="GI:32261205"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPTNPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSHISSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
ORIGIN

Query Match 95.3%; Score 1125.2; DB 10; Length 1170;

Best Local Similarity 97.6%; Pred. No. 0;

Matches 1142; Conservative 0; Mismatches 28; Indels 0; Gaps 0;



Qy    1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy   61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
        |||||||||||||||||||||   |||||||||||||||||||||||||||||||||||
Db   61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACT 120

Qy  121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
        |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db  121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGATTCACCCCACCGCAC 180

Qy  181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
        ||||| |||||||||||||||||||||||||||||||||  ||||||||||||| |||||
Db  181 GGAGGTCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACCTTGCCAACAAAT 240

Qy  241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
        ||||||||||| || |||||||||||||||||||||||||||||||||||||||||||||
Db  241 CCGCCTCCTGCCTCTACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300

Qy  301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
        ||||||||||| |||||||| |||||||||||||||||||| || |||||||||||||||
Db  301 TTGAGAAACACACATCCTCAGGCCATGCAGTGGAACTCCACTACCTTCCACCAAACTCTG 360

Qy  361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  361 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy  421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
        ||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  421 AACCCTGTTCCGACTACTGTCTCTCACATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy  481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
        |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540

Qy  541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600


Qy  601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy  661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy  721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy  781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||
Db  781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACTACCAGC 840

Qy  841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
        ||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
Db  841 ACGGGACCATGCAGAACCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGC 900

Qy  901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
        |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db  901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db 1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
        ||||||||||||||||||||||||||||||
Db 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170



RESULT 12
AY576427
LOCUS       AY576427                1170 bp    DNA     linear   VRL 01-APR-2005
DEFINITION  Hepatitis B virus isolate 39 large S protein gene, partial cds.
ACCESSION   AY576427
VERSION     AY576427.1  GI:50953650
KEYWORDS    .
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1  (bases 1 to 1170)
  AUTHORS   Kew,M.C., Kramvis,A., Yu,M.C. and Hodkinson,J.
  TITLE     Increased hepatocarcinogenic potential of hepatitis B virus
            genotype A in black Africans
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1170)
  AUTHORS   Kramvis,A. and Kew,M.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (19-MAR-2004) Medicine (Molecular Hepatology Research
            Unit), University of the Witwatersrand, 7 York Road, Johannesburg,
            GP 2193, South Africa
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /mol_type="genomic DNA"
                     /serotype="ayw2"
                     /isolate="39"
                     /db_xref="taxon:10407"
                     /country="South Africa"
                     /note="genotype: D"
     CDS             <1..1170
                     /note="envelope protein"
                     /codon_start=1
                     /product="large S protein"
                     /protein_id="AAT90407.1"
                     /db_xref="GI:50953651"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTASPLSSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
ORIGIN

Query Match 95.3%; Score 1125.2; DB 10; Length 1170;

Best Local Similarity 97.6%; Pred. No. 0;

Matches 1142; Conservative 0; Mismatches 28; Indels 0; Gaps 0;



Qy    1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy   61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
        |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db   61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120

Qy  121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
        |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db  121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGTTTCACCCCACCGCAC 180

Qy  181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
        |||||||||||||||||||||||||||||||||||||||  |||||| ||||||||||||
Db  181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACTTTGCCAGCAAAT 240

Qy  241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
        ||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CCGCCTCCTGCCTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300

Qy  301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
        |||||||||||||||||||| |||||||||||||| |||||||| |||||||||||||||
Db  301 TTGAGAAACACTCATCCTCAGGCCATGCAGTGGAATTCCACAACCTTCCACCAAACTCTG 360

Qy  361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
        |||||||||||||||||||| |||||||| ||||||||||||||||||||||||||||||
Db  361 CAAGATCCCAGAGTGAGAGGCCTGTATTTTCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy  421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
        ||||||||||||||||||| ||||||| ||||||||||||||||||||||||||||||||
Db  421 AACCCTGTTCCGACTACTGCCTCTCCCTTATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy  481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
        |||| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db  481 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTTCTCGTGTTACAGGCG 540

Qy  541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
Qy  601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy  661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy  721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy  781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
        |||||||||||||||||||||||||| |||||||||||||||||||| ||||||||||||
Db  781 CTGGACTATCAAGGTATGTTGCCCGTGTGTCCTCTAATTCCAGGATCCTCAACCACCAGC 840

Qy  841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
        |||||||||||| || ||||||||||||||||||||||||||||||||||||||||||||
Db  841 ACGGGACCATGCCGAACCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy  901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
        |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db  901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db 1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
        ||||||||||||||||||||||||||||||
Db 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170



RESULT 13
AY603460
LOCUS       AY603460     1170 bp    DNA     linear   VRL 19-APR-2005
DEFINITION  Hepatitis B virus isolate 02T large S protein (S), middle S protein
            (S), and S protein (S) genes, complete cds.
ACCESSION   AY603460
VERSION     AY603460.1  GI:47499931
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1  (bases 1 to 1170)
  AUTHORS   Sominskaya,I., Mihailova,M., Jansons,J., Emelyanova,V.,
            Folkmane,I., Smagris,E., Dumpis,U., Rozentals,R. and Pumpens,P.
  TITLE     Hepatitis B and C virus variants in long-term immunosuppressed
            renal transplant patients in Latvia
  JOURNAL   Intervirology 48 (2-3), 192-200 (2005)
   PUBMED   15812194
REFERENCE   2  (bases 1 to 1170)
  AUTHORS   Sominskaya,I., Mihailova,M., Jansons,J., Emelyanova,V.,
            Folkmane,I., Smagris,E., Dumpis,U., Rozentals,R. and Pumpens,P.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-2004) Protein Engineering Department, Biomedical
            Research and Study Centre, University of Latvia, Ratsupites str. 1,
            Riga LV-1067, Latvia
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /virion
                     /mol_type="genomic DNA"
                     /isolate="02T"
                     /isolation_source="renal transplantation patient"
                     /db_xref="taxon:10407"
                     /country="Latvia"
                     /note="subtype: ayw2; genotype: D"
     gene            1..1170
                     /gene="S"
     CDS             1..1170
                     /gene="S"
                     /note="pre-S1/pre-S2/S; LHBS/MHBS/HBsAg; surface antigen"
                     /codon_start=1
                     /product="large S protein"
                     /protein_id="AAT28717.1"
                     /db_xref="GI:47499932"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGIIQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTASPISSIFSRIGDP
                     VLNMENITSGFLGPLLVLQAGFFLLTKIFTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARSS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
     CDS             325..1170
                     /gene="S"
                     /note="pre-S2/S; MHBS/HBsAg; surface antigen"
                     /codon_start=1
                     /product="middle S protein"
                     /protein_id="AAT28718.1"
                     /db_xref="GI:47499933"
                     /translation="MQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTASPISS
                     IFSRIGDPVLNMENITSGFLGPLLVLQAGFFLLTKIFTIPQSLDSWWTSLNFLGGTTV
                     CLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLP
                     VCPLIPGSSTTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLW
                     EWASARSSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFC
                     LWVYI"
     CDS             490..1170
                     /gene="S"
                     /note="HBsAg; surface antigen"
                     /codon_start=1
                     /product="S protein"
                     /protein_id="AAT28719.1"
                     /db_xref="GI:47499934"
                     /translation="MENITSGFLGPLLVLQAGFFLLTKIFTIPQSLDSWWTSLNFLGG
                     TTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQG
                     MLPVCPLIPGSSTTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGK
                     FLWEWASARSSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPI
                     FFCLWVYI"
ORIGIN



Query Match 95.3%; Score 1125.2; DB 10; Length 1170;
Best Local Similarity 97.6%; Pred. No. 0;
Matches 1142; Conservative 0; Mismatches 28; Indels 0; Gaps 0;

Qy 1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
Qy 61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120




1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MM III II II II Mill IMIIMI li II II II II 




UD 


a i 
o± 


PPAPPnTTOHOTlPOR A APAOA/T^A A 7V T 1 /'"*/"' A A 'PT^'VO A /""FT*/"* A AT , /^/^/* 1 AA/ w 'AA^/^A/* 1 A/ w '/* 1 

LLAGLL 1 1 LAGAGLAAALALAGLAAA1 LLAGA 1 1 GGGAL I I LAA I LLLAALAAGGALALL 


120 


Qy 121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180






II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M M 1 M 1 M II II 1 Mill 1 1 II I II 1 1 1 1 1 i II 




nK 
UD 




r m/^ t r , /^7s.r*T\.f n, r*r , r , T\ aoa Af'" i /'" ,r r , A/T" , A/T ir Pf- i /'' i A a n^^vc^cicc^ r^tvcar* T^^T^r^T^ r^r^r^r^T^ r^r^r^r^j^ 
1 GGLLAGALGLLAALAAGG 1 AGGAGL 1 GGAGLA 1 1 LGGGL 1 uuuA 1 1 LALLLLALLGLAL 


XoU 


Qy 181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240




MMIM II M IM 1 MM II II MMMI II II II II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




r»K 


1 Q1 


GGAGGLL 1111 GGGG 1 GGAGLLL 1 LAGGL 1 LAGGGLA 1 AA 1 ALAAALL 1 1 GLLAGLAAA 1 


o a r\ 


Qy 241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300




MIMIIMM II 1 1 M 1 1 1 1 II 1 1 M 1 1 1 ! M 1 1 1 M 1 1 1 1 1 1 1 1 II M M 1 1 M 1 1 




nK 

UD 




LLGLL 1 LL 1 GL Al L 1 ALLAA 1 LGLLAG 1 LAGGAAGGLAGLL 1 ALLLLGL 1 G 1 L 1 LLALL 1 


i n a 


Qy 301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360






MMIM M M II M II II 1 IIIIIIMMIIIIIIIIIMM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




UD 


Jul 


TTPAPS A APA /""TV A TPPTPAPPPPATPPAPTPPA A /""TV/"* A PA A ppmrppp* r*r* A A A P^PPTP 
1 1 oAoAAHLAL 1 LA 1 1 CnbuLLAl uLnu 1 uuAAL 1 L LALAAt 1 1 1LLALLAAAL1L 1G 




Qy 361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420






lllllllllll IMIIMI ! 1 1 1 1 1 1 1 M M 1 1 1 1 1 1 II 1 1 1 1 1 M 1 1 1 1 1 i 1 1 1 i 1 1 




UD 


Jol 


OA A O A TPPP A Z" 1 rp/~i A Z" 1 A PPPprnrirp* TTTPPPTip pmppmpppmprn /"irn rp/~i 7\ /"l A npi\nrmv 

LAAGA1 LLLAGGG1 GAGAGGlCTGTATTTCClTGlTGGTGGlTl 


420 


Qy 421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480






1 IM M M 1 1 M 1 II 1 1 1' M M 1 II II II 1 1 1 1 1 1 M M 1 II II M 1 M II 1 M 1 1 1 




UD 


4z 1 


A A OOO r PO r P r POOO A PT A piippi /~i /^i m /"irp nnniv rpATPPTPA A rp /^m rp/""! rp/~i/~i A /~i A frpPPPPJk /-i/->/~irp 

AALLL1G1 lLLGALlALlGLLlLlLLLAlAlLGlLAAlLi 1 L 1 LGAGGATTGGGGACCCT 


48 0 


Qy 481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540






1 II II M IM 1 IM Mill 1 II IMM Ml M M M II II II MM 1 M III M 1 III 




UD 


f± O X 


PTPPTPS APATPPAPA JPATP APSTPAPPATTPPTAPP A PPPPTPPTPPTP'PTA /"< A /"»/TV 

G1GL1 GAAL A 1 GGAGAAL A 1 LALAiLAvjuAl 1 LL lAUvjALLLLlbL 1 LG1G1 1 ALAGGLG 


A f\ 


Qy 541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600






llllllll IMMIIII MM 1 II II II II MM M II 1 II M M MM 1 II 1 1 II 




UD 




PPPTTTTTPTTPTTP A P A A A A AT'OT'T'OAOA ATA /~i /~i 7\ /"i 71 /"trp /-ir-p »nj prripprriPPIiPPA nm 

GGG1 1111L11G1 1GALAAAAA1L1 1 LALAAl AllGlAGAGTlTAGAlTCGTGGTGGAlT 


600 


Qy 601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660






MM IMIII II 1 1 1 II II 1 Ml 1 1 II II II II II II M M 1 1 1 1 1 1 1 1 II 1 1 II II II 




UD 




TPTPTPA A TTTTPT A POP/*"' P A A PT A pppmpmpriiprpnipp 7\ t\ t\ t\ mmppOTi /~irp /~i f-y r~yr~\ A A /^"i 

1 LI LI LAA 1 1 1 lLlAGGGGGAALlALLGlGlGlLriGGLLAAAATTCGCAGTLLLLAALL 


660 


Qy 661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720






Ml III III 1 1 II 1 1 1 II III II M 1 1 1 IMM 1 1 II II III II II II 1 II 1 II M II 1 1 




UD 


DDI 


mppA ATPAPTPAPPA AOOTOOT'OT'OO'T'OOA A PTTPTPPTPPTrp A fpPPPriipPA rprirpnrpri'Tir' 

1 LLAA 1 LAC 1 LALLAALL 1 LL 1 G 1 LL 1 LLAAL 1 1 Gl LCI GG iTATlGCTGGATGTGTlTG 


720 


Qy 721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780






MM MMIM II II IMIIMI II III lllllllllll Ml II II IMIIMI MM II 




nK 
UD 


TOT 
I Z 1 


LGGLG1 1 1 1A1LA1L1 iLLlLl 1 LAILL rGLTGL rATGCCTCATCTTCTTGTTGGTTCTT 


780 


Qy 781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840






1 MM II 1 1 M II 1 1 II 1 1 1 1 II 1 1 II II MM 1 1 1 1 1 IM M 1 M M M II 1 II II II 




nK 
UD 


TOT 

/ ol 


CTGGAlTATCAAGGTATGTTGlCCGTTTGTCCTCTAATTCCAGGATCT 


840 


Qy 841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900






1 1 1 II III 1 II II 1 1 II III II II 1 1 Mill III II III M MM II 1 1 M II 1 1 II M 




nK 
UD 


Q A 1 


ALGGGALLA 1 GLAGAALL 1 GL ALGAL 1 LL 1 GL 1 LAAGGAAllTlTATGTATlCCTllTGT 


900 


Qy 901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
       |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db 901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960


Qy 961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
       ||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||
Db 961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTCCTCCTGGCTCAGTTTACTAGTG 1020
Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db 1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
        |||||||| |||||||||||||||||||||
Db 1141 ATTTTCTTCTGTCTTTGGGTATACATTTAA 1170



RESULT 14
AY603464
LOCUS       AY603464     1170 bp    DNA     linear   VRL 19-APR-2005
DEFINITION  Hepatitis B virus isolate 12T large S protein (S), middle S protein
            (S), and S protein (S) genes, complete cds.
ACCESSION   AY603464
VERSION     AY603464.1  GI:47499947
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1  (bases 1 to 1170)
  AUTHORS   Sominskaya,I., Mihailova,M., Jansons,J., Emelyanova,V.,
            Folkmane,I., Smagris,E., Dumpis,U., Rozentals,R. and Pumpens,P.
  TITLE     Hepatitis B and C virus variants in long-term immunosuppressed
            renal transplant patients in Latvia
  JOURNAL   Intervirology 48 (2-3), 192-200 (2005)
   PUBMED   15812194
REFERENCE   2  (bases 1 to 1170)
  AUTHORS   Sominskaya,I., Mihailova,M., Jansons,J., Emelyanova,V.,
            Folkmane,I., Smagris,E., Dumpis,U., Rozentals,R. and Pumpens,P.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-2004) Protein Engineering Department, Biomedical
            Research and Study Centre, University of Latvia, Ratsupites str. 1,
            Riga LV-1067, Latvia
FEATURES             Location/Qualifiers
     source          1..1170
                     /organism="Hepatitis B virus"
                     /virion
                     /mol_type="genomic DNA"
                     /isolate="12T"
                     /isolation_source="renal transplantation patient"
                     /db_xref="taxon:10407"
                     /country="Latvia"
                     /note="subtype: ayw3; genotype: D"
     gene            1..1170
                     /gene="S"
     CDS             1..1170
                     /gene="S"
                     /note="pre-S1/pre-S2/S; LHBS/MHBS/HBsAg; surface antigen"
                     /codon_start=1
                     /product="large S protein"
                     /protein_id="AAT28729.1"
                     /db_xref="GI:47499948"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGIIQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSPISSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSAGTCRTCTTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
     CDS             325..1170
                     /gene="S"
                     /note="pre-S2/S; MHBS/HBsAg; surface antigen"
                     /codon_start=1
                     /product="middle S protein"
                     /protein_id="AAT28730.1"
                     /db_xref="GI:47499949"
                     /translation="MQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTVSPISS
                     IFSRIGDPALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTV
                     CLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLP
                     VCPLIPGSSTTSAGTCRTCTTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLW
                     EWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFC
                     LWVYI"
     CDS             490..1170
                     /gene="S"
                     /note="HBsAg; surface antigen"
                     /codon_start=1
                     /product="S protein"
                     /protein_id="AAT28731.1"
                     /db_xref="GI:47499950"
                     /translation="MENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGG
                     TTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQG
                     MLPVCPLIPGSSTTSAGTCRTCTTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGK
                     FLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPI
                     FFCLWVYI"
ORIGIN 
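The three CDS spans in the feature table above share one stop codon at position 1170, so each protein length follows from its start coordinate alone. A minimal Python sketch of that arithmetic, assuming inclusive 1-based coordinates and a terminal stop codon; `protein_length` is an illustrative helper, not part of GenBank or GenCore:

```python
def protein_length(start: int, end: int) -> int:
    """Residues encoded by an inclusive, 1-based CDS span that ends in a stop codon."""
    n_bases = end - start + 1
    if n_bases % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    return n_bases // 3 - 1  # the final codon is the stop and is not translated


# CDS spans as listed in the feature table of this record.
cds_spans = {
    "large S protein": (1, 1170),
    "middle S protein": (325, 1170),
    "S protein": (490, 1170),
}

lengths = {name: protein_length(*span) for name, span in cds_spans.items()}
for name, n_residues in lengths.items():
    print(f"{name}: {n_residues} aa")
```

The results can be checked against the /translation qualifiers, whose lines hold 44 residues on the first line and up to 58 on each continuation line.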



Query Match 95.3%; Score 1125.2; DB 10; Length 1170; 

Best Local Similarity 97.6%; Pred. No. 0; 

Matches 1142; Conservative 0; Mismatches 28; Indels 0; Gaps 0; 
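The percentages in this summary appear to follow directly from the counts. A minimal Python sketch, assuming Query Match is the score divided by the perfect score (1181 in this run's header) and Best Local Similarity is the share of identical columns in the alignment; both function names are illustrative and are not GenCore identifiers:

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Alignment score as a percentage of the query's perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int = 0) -> float:
    """Identical columns as a percentage of all aligned columns."""
    aligned_columns = matches + mismatches + indels
    return 100.0 * matches / aligned_columns


# Values reported for AY603464; the perfect score comes from the run header.
qm = query_match_pct(1125.2, 1181)
bls = best_local_similarity_pct(1142, 28)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

Under the same assumptions the figures for AY230128 (score 1125, 1146 matches, 35 mismatches) come out at 95.3% and 97.0%.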



Qy 1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy 61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
      |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db 61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120

Qy 121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
       |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db 121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGATTCACCCCACCGCAC 180

Qy 181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
       |||||||||||||||||||||||||||||||||||||| | |||||| ||||||||||||
Db 181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATCATACAAACTTTGCCAGCAAAT 240

Qy 241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
       ||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CCGCCTCCTGCATCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300

Qy 301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
       |||||||||||||||||||| ||||||||||||||||||||||| |||||||||||||||
Db 301 TTGAGAAACACTCATCCTCAGGCCATGCAGTGGAACTCCACAACCTTCCACCAAACTCTG 360

Qy 361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
       ||||||||||| |||||||| |||||||||||||||||||||||||||||||||||||||
Db 361 CAAGATCCCAGGGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy 421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
Qy 481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
       |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540

Qy 541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600

Qy 601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy 661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy 721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy 781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
       |||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
Db 781 CTGGACTATCAAGGTATGTTGCCCGTCTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840

Qy 841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
        ||||| |||||||| ||||||||||| ||||||||||||||||||||||||||||||||
Db 841 GCGGGAACATGCAGAACCTGCACGACTACTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy 901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
       |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db 901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy 961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CCCTTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||| ||||||
Db 1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTCTTACCA 1140

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAA 1170
        |||||||| |||||||||||||||||||||
Db 1141 ATTTTCTTCTGTCTTTGGGTATACATTTAA 1170



RESULT 15
AY230128
LOCUS       AY230128     1186 bp    DNA     linear   VRL 23-NOV-2003
DEFINITION  Hepatitis B virus isolate case 24 non-tumor large surface protein,
            middle surface protein, and small surface protein genes, complete
            cds.
ACCESSION   AY230128
VERSION     AY230128.1  GI:38374285
KEYWORDS
SOURCE      Hepatitis B virus
  ORGANISM  Hepatitis B virus
            Viruses; Retro-transcribing viruses; Hepadnaviridae;
            Orthohepadnavirus.
REFERENCE   1  (bases 1 to 1186)
  AUTHORS   Raimondo,G., Pollicino,T. and Raffa,G.
  TITLE     Occult HBV in liver cancer
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1186)
  AUTHORS   Raimondo,G., Pollicino,T. and Raffa,G.
  TITLE     Direct Submission
  JOURNAL   Submitted (04-FEB-2003) Internal Medicine, University of Messina,
            via Consolare Valeria, Messina 98124, Italy
FEATURES             Location/Qualifiers
     source          1..1186
                     /organism="Hepatitis B virus"
                     /virion
                     /mol_type="genomic DNA"
                     /isolate="case 24 non-tumor"
                     /db_xref="taxon:10407"
     CDS             1..1170
                     /note="preS1; preS/S"
                     /codon_start=1
                     /product="large surface protein"
                     /protein_id="AAR19340.1"
                     /db_xref="GI:38374286"
                     /translation="MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDA
                     NKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLR
                     NTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVLTTASPLSSIFSRIGDP
                     ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQCLDSWWTSLNFLGGTTVCLGQNSQS
                     PTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGS
                     STTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFS
                     WLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI"
     CDS             325..1170
                     /note="preS2/S"
                     /codon_start=1
                     /product="middle surface protein"
                     /protein_id="AAR19341.1"
                     /db_xref="GI:38374287"
                     /translation="MQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVLTTASPLSS
                     IFSRIGDPALNMENITSGFLGPLLVLQAGFFLLTRILTIPQCLDSWWTSLNFLGGTTV
                     CLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLP
                     VCPLIPGSSTTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLW
                     EWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFC
                     LWVYI"
     CDS             490..1170
                     /note="S"
                     /codon_start=1
                     /product="small surface protein"
                     /protein_id="AAR19342.1"
                     /db_xref="GI:38374288"
                     /translation="MENITSGFLGPLLVLQAGFFLLTRILTIPQCLDSWWTSLNFLGG
                     TTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQG
                     MLPVCPLIPGSSTTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGK
                     FLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPI
                     FFCLWVYI"
ORIGIN



Query Match 95.3%; Score 1125; DB 10; Length 1186;
Best Local Similarity 97.0%; Pred. No. 0;
Matches 1146; Conservative 0; Mismatches 35; Indels 0; Gaps 0;

Qy 1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 ATGGGGCAGAATCTTTCCACCAGCAATCCTCTGGGATTCTTTCCCGACCACCAGTTGGAT 60

Qy 61 CCAGCCTTCAGAGCAAACACCAACAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
      |||||||||||||||||||||   ||||||||||||||||||||||||||||||||||||
Db 61 CCAGCCTTCAGAGCAAACACCGCAAATCCAGATTGGGACTTCAATCCCAACAAGGACACC 120
Qy 121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGACTGGGGTTCACCCCACCGCAC 180
       |||||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||
Db 121 TGGCCAGACGCCAACAAGGTAGGAGCTGGAGCATTCGGGCTGGGTTTCACCCCACCGCAC 180

Qy 181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATAACACAAACCTTGCCAGCAAAT 240
       |||||||||||||||||||||||||||||||||||||||  |||||| ||||||||||||
Db 181 GGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAGGGCATACTACAAACTTTGCCAGCAAAT 240

Qy 241 CCGCCTCCTGCTTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300
       ||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CCGCCTCCTGCCTCCACCAATCGCCAGTCAGGAAGGCAGCCTACCCCGCTGTCTCCACCT 300

Qy 301 TTGAGAAACACTCATCCTCAAGCCATGCAGTGGAACTCCACAACTTTCCACCAAACTCTG 360
       |||||||||||||| ||||| |||||||||||||| |||||||| |||||||||||||||
Db 301 TTGAGAAACACTCACCCTCAGGCCATGCAGTGGAATTCCACAACCTTCCACCAAACTCTG 360

Qy 361 CAAGATCCCAGAGTGAGAGGTCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 361 CAAGATCCCAGAGTGAGAGGCCTGTATTTCCCTGCTGGTGGCTCCAGTTCAGGAACAGTA 420

Qy 421 AACCCTGTTCCGACTACTGTCTCTCCCATATCGTCAATCTTCTCGAGGATTGGGGACCCT 480
       |||||||||| |||||||| ||||||| ||||||||||||||||||||||||||||||||
Db 421 AACCCTGTTCTGACTACTGCCTCTCCCTTATCGTCAATCTTCTCGAGGATTGGGGACCCT 480

Qy 481 GCGCGGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCG 540
       |||| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db 481 GCGCTGAACATGGAGAACATCACATCAGGATTCCTAGGACCCCTTCTCGTGTTACAGGCG 540

Qy 541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT 600
       ||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
Db 541 GGGTTTTTCTTGTTGACAAGAATCCTCACAATACCGCAGTGTCTAGACTCGTGGTGGACT 600

Qy 601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 TCTCTCAATTTTCTAGGGGGAACTACCGTGTGTCTTGGCCAAAATTCGCAGTCCCCAACC 660

Qy 661 TCCAATCACTCACCAACCTCCTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 661 TCCAATCACTCACCAACCTCTTGTCCTCCAACTTGTCCTGGTTATCGCTGGATGTGTCTG 720

Qy 721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CGGCGTTTTATCATCTTCCTCTTCATCCTGCTGCTATGCCTCATCTTCTTGTTGGTTCTT 780

Qy 781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCTTCAACCACCAGC 840
       ||||||||||||||||||||||||||||||||||||||||||||||| ||||| ||||||
Db 781 CTGGACTATCAAGGTATGTTGCCCGTTTGTCCTCTAATTCCAGGATCCTCAACAACCAGC 840

Qy 841 ACGGGACCATGCAGAGCCTGCACGACTCCTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900
       |||||||||||| |  |||||| |||| ||||||||||||||||||||||||||||||||
Db 841 ACGGGACCATGCCGGACCTGCATGACTACTGCTCAAGGAACCTCTATGTATCCCTCCTGT 900

Qy 901 TGCTGTACAAAACCTTCGGATGGAAACTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960
       |||||||| ||||||||||| ||||| |||||||||||||||||||||||||||||||||
Db 901 TGCTGTACCAAACCTTCGGACGGAAATTGCACCTGTATTCCCATCCCATCATCCTGGGCT 960

Qy 961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 TTCGGAAAATTCCTATGGGAGTGGGCCTCAGCCCGTTTCTCCTGGCTCAGTTTACTAGTG 1020

Qy 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CCATTTGTTCAGTGGTTCGTAGGGCTTTCCCCCACTGTTTGGCTTTCAGTTATATGGATG 1080

Qy 1081 ATGTTGTACTGGGGGCCAAGTCTGTACACCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140
        |||| ||| ||||||||||||||||||| |||||||||||||||||||||||||||||||
Db 1081 ATGTGGTATTGGGGGCCAAGTCTGTACAGCATCTTGAGTCCCTTTTTACCGCTGTTACCA 1140

Qy 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA 1181 

        ||||||||||||||||||||||||||||||||||||| |||
Db 1141 ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAACAAA 1181 
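
Because every alignment in this listing reports zero indels, the marker row between each Qy and Db line can be recomputed from the two sequences. A minimal Python sketch, assuming gap-free rows of equal length; `match_line` is an illustrative helper and not part of GenCore:

```python
def match_line(query: str, subject: str) -> str:
    """Return '|' under identical columns and ' ' under mismatched ones."""
    if len(query) != len(subject):
        raise ValueError("rows must have equal length (gap-free alignment assumed)")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))


# Final row of the AY230128 alignment (query positions 1141 to 1181).
qy = "ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAATAAA"
db = "ATTTTCTTTTGTCTTTGGGTATACATTTAAACCCTAACAAA"

line = match_line(qy, db)
print(qy)
print(line)
print(db)
print("mismatches in this row:", line.count(" "))
```

Summing the blank columns over all rows of an alignment should reproduce the Mismatches count in its summary line.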